-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
WIP: PERF: Cythonize fillna #42309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: PERF: Cythonize fillna #42309
Conversation
pandas/core/internals/blocks.py
Outdated
# TODO: This seems to work for EAS, verify it does | ||
return [ | ||
self.make_block_same_class( | ||
values=self.values.fillna(value=value, limit=limit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you'll probably need to make NDArrayBackedExtensionArray.fillna use the new cython function(s)
# TODO: Verify EA case | ||
if is_extension_array_dtype(value): | ||
mask = value.isna() | ||
value = np.asarray(value[mask], dtype=object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
going straight to object may be unnecessarily heavy for e.g. Categorical value
.
pandas/core/internals/blocks.py
Outdated
mask = value.isna() | ||
value = np.asarray(value[mask], dtype=object) | ||
libalgos.fillna1d_multi_values( | ||
arr.values[mask], value=value, limit=limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is going to fill arr.values[mask] inplace, but i dont think arr.values[mask] shares data with arr.values?
My main advice is to not worry about the Block stuff for now, just get the cython code working and write tests for it directly with ndarrays (and NDArrayBackedExtensionArray potentiall) |
@jbrockmendel Is it OK to do the NDArrayBacked.fillna changes in a followup, since this PR is already getting somewhat large? |
Good idea, yes. |
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this. |
Appears this PR has been dormant for a while closing to clear the queue. Feel free to reopen when you have time to work on this PR. |
Code is still pretty messy at this point, opening mainly for CI purposes. Also, this is my first time messing around with BlockManager so would appreciate a little extra scrutiny on that part.
Benchmarks
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
before after ratio
[9f6a91a] [77c0563]
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.